
Save Usenet group info in a text file?

Mr. Mike

Jun 2, 2018, 11:50:33 AM
Is there any way you can save the list of subjects from a Usenet
group as seen with Agent into a text file, or even an Excel file with
the various columns: Lines/Subject/Author/Date?

Fred

Jun 2, 2018, 4:34:03 PM
There is a workaround for that. Select all the messages with
downloaded bodies, then File > Save message as... > Unix format. Then
you can use Textpad's FIND IN FILES to search the saved files for the
subject. Other text editors and search tools can do everything you
want, but you will have to learn them to extract the information you
want.

You can get Textpad at textpad.com
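
The extraction step can also be scripted. Below is a minimal sketch, assuming the "Unix format" file is mbox-style: each message begins with a line starting with "From ", followed by the headers, a blank line, and the body. The function name listHeaders and the sample text are made up for illustration, and a Lines header will only be found if the article actually carries one.

```javascript
// Sketch: pull Lines/Subject/From/Date out of an mbox-style text file.
// Assumption: every message starts with a "From " separator line.
function listHeaders(mboxText) {
  const rows = [];
  let current = null;      // headers of the message being read
  let inHeaders = false;   // true until the first blank line of a message
  let lastName = null;     // last header stored, for folded continuation lines

  for (const line of mboxText.split(/\r?\n/)) {
    if (line.startsWith("From ")) {
      // Separator line: start a new message.
      current = { lines: "", subject: "", from: "", date: "" };
      rows.push(current);
      inHeaders = true;
      lastName = null;
    } else if (inHeaders && line === "") {
      inHeaders = false;   // a blank line ends the header block
    } else if (inHeaders && /^[ \t]/.test(line) && lastName) {
      current[lastName] += " " + line.trim();   // folded header line
    } else if (inHeaders) {
      const m = /^(Lines|Subject|From|Date):\s*(.*)$/i.exec(line);
      lastName = m ? m[1].toLowerCase() : null;
      if (m) current[lastName] = m[2];
    }
  }
  // One tab-separated line per message; this pastes cleanly into Excel.
  return rows.map(r => [r.lines, r.subject, r.from, r.date].join("\t"));
}

const sample = [
  "From - Sat Jun 02 11:50:33 2018",
  "From: Mr. Mike <mike@example.invalid>",
  "Subject: Save Usenet group info",
  " in a text file?",
  "Date: Sat, 02 Jun 2018 08:50:23 -0700",
  "Lines: 3",
  "",
  "Body text here.",
  "From: this line is body text, not a header",
].join("\n");

console.log(listHeaders(sample).join("\n"));
```

Run under Node.js, this prints one tab-separated row per message. To use it on a real file, read the saved file into a string and pass that to listHeaders.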

Arthur T.

Jun 3, 2018, 12:15:19 AM
In Message-ID:<g4f5hd98mhqg7k82v...@4ax.com>,
This has been asked many times over the years. As noted, there
are kluges *if* the messages have bodies. Otherwise, you need other
software.

Snagit sometimes works. Other than by trying, I have no way to
know if it will in any particular instance. OTOH, I don't have the
latest version. It's not free software.

--
Arthur T. - ar23hur "at" pobox "dot" com

Steve Hayes

Jun 3, 2018, 1:06:50 AM
On Sun, 03 Jun 2018 00:14:33 -0400, Arthur T. <art...@munged.invalid>
wrote:
Isn't this the kind of thing that AWK can do?


--
Steve Hayes
http://www.khanya.org.za/stevesig.htm
http://khanya.wordpress.com

Ralph Fox

Jun 3, 2018, 2:00:30 AM
Here is how to save the message list as an HTML file with the columns
Lines/Subject/Author/Date.

This works whether or not the bodies have been downloaded.
This requires Agent 4.2 or later.

1. In the message list pane, select the messages
   whose headers you want to save.

2. Use "File >> Import and Export >> Export NZB File"
   to save those headers as an NZB file.

3. Copy and save the text between the squiggly lines below
   as an HTML file "nzb2htm.htm".

4. Open the file "nzb2htm.htm" in Firefox or Chrome.

   For this step, do not use IE or Edge.
   (The HTML message list can be viewed in IE or Edge after it
   has been converted, but IE and Edge do not have what it takes
   to do the conversion from NZB to HTML.)

   You will see a button labelled "Open NZB file:".

5. Click this button and select the NZB file which you saved in step 2.

   If you have selected a valid NZB file, you will now see
   another button labelled "Save HTM file:".

6. Click the button "Save HTM file:" and save the HTML file with the
   message list headers.

You can open this saved HTML message list file in any browser you like
including IE, Edge, Firefox, and Chrome.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~>% nzb2htm.htm %<~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<!DOCTYPE html>
<html lang="en">
<!-- saved from url=(0014)about:internet -->
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" >
<meta http-equiv="X-UA-Compatible" content="IE=edge" >
<title>NZB to HTML message list</title>
<script type="text/javascript">
function escapeHtml(s){
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;")
          .replace(/"/g, "&quot;").replace(/'/g, "&#039;")
          .replace(/\r/g, "&#x0D;").replace(/\n/g, "&#x0A;").replace(/\t/g, "&#x09;");
}

/* Convert Unix dates to local time zone dates, in the Date column of the HTML output */
function convertDates(newDocument) {
  var myNodeList = newDocument.getElementsByTagName("td");
  for ( var i = 0 ; i < myNodeList.length ; ++i ){
    var cell = myNodeList[i];
    if ( cell != null && cell.hasAttribute("x-unixtime") ){
      var dt = new Date( 1000 * cell.getAttribute("x-unixtime") );
      cell.innerHTML = escapeHtml( dt.toLocaleDateString() + " " + dt.toLocaleTimeString() );
      cell.removeAttribute("x-unixtime");
    }
  }
}

/* Transform the user's NZB file to HTML */
function convertNZBtoHTML( nzbFileText, infile, outfile ) {
  var nzbDocument = (new DOMParser()).parseFromString(nzbFileText,"text/xml");
  var xslStylesheetText = document.getElementById("xslt2html").textContent;
  var xslStylesheet = (new DOMParser()).parseFromString(xslStylesheetText,"text/xml");
  var myXSLTProcessor = new XSLTProcessor();
  myXSLTProcessor.importStylesheet(xslStylesheet);
  var newDocument = myXSLTProcessor.transformToDocument(nzbDocument);
  if ( newDocument == null ){
    /* MS-Edge pretends to support transformToDocument but fails when you need it */
    document.getElementById("browserwarn").style.display = "block";
    document.getElementById("browserok").style.display = "none";
    return;
  }
  if ( (newDocument.documentElement.nodeName.toLowerCase() != 'html') ||
       (newDocument.documentElement.querySelector('body > div > table > colgroup > col') == null ) ) {
    /* The input file is not a valid NZB file */
    document.getElementById("badfilename").innerHTML = escapeHtml(infile);
    document.getElementById("badfile").style.display = "block";
    document.getElementById("dataurldiv").style.display = "none";
    return;
  } else {
    document.getElementById("badfile").style.display = "none";
  }
  /* Convert the message dates in the output from Unix time integers to readable format */
  convertDates(newDocument);
  /* Let the user download the HTML output. Displaying it on screen does not work well in Chrome. */
  var newDocAsString = "<!DOCTYPE html>\r\n<!-- saved from url=(0014)about:internet -->\r\n" + newDocument.documentElement.outerHTML ;
  var dataUrl = "data:text/html;," + newDocAsString.replace(/%/g, "%25")
      .replace(/\x0D/g, "%0D").replace(/\x0A/g, "%0A").replace(/[ \x09]/g, "%20")
      .replace(/[\x00-\x1F]/g, "").replace(/#/g, "%23");
  document.getElementById("dataurllink").setAttribute("download", outfile);
  document.getElementById("dataurllink").href = dataUrl;
  document.getElementById("dataurldiv").style.display = "block";
}

/* Process the input file. Triggered when the user chooses a file to process. */
function processInputFile(files){
  if ( files.length == 0 ){
    return;
  }
  var infilename = files[0].name;
  var outfilename = infilename.replace(/\.nzb$/i,"") + ".htm";
  var reader = new FileReader();
  reader.onload = function(){
    convertNZBtoHTML( this.result, infilename, outfilename );
  };
  reader.readAsText(files[0], "utf-8");
}

function resetFileInput(){
  document.getElementById("fileform").reset();
}

function initialize(){
  resetFileInput();
  if ( window.FileReader && window.FileList && window.DOMParser &&
       window.DOMParser.prototype.parseFromString && window.XSLTProcessor ){
    document.getElementById("browserok").style.display = "block";
  }
  else {
    document.getElementById("browserwarn").style.display = "block";
  }
}
</script>
</head>
<body onload="initialize()">
<h1>NZB to HTML message list</h1>
<noscript>
<hr />
<h2>JavaScript required</h2>
<p>This browser-based application requires JavaScript.</p>
<p>To use this application, enable JavaScript and then reload this page.</p>
</noscript>
<div id="browserwarn" style="display: none;">
<hr />
<h2>Unsupported browser or version</h2>
Suggest you use one of the following desktop browsers:
<ul>
<li>Firefox (Mozilla Firefox)</li>
<li>Chrome (Google Chrome)</li>
</ul>
</div>
<div id="browserok" style="display: none;">
<form id="fileform">
<label>Open NZB file: <input type="file" id="fileinput" accept=".nzb" onchange="processInputFile(this.files)" /></label>
</form>
<hr />
<div id="badfile" style="display: none;">
<h2 style="color: #FF0000;">Not a valid NZB file: <i id="badfilename"> </i></h2>
</div>
<div id="dataurldiv" style="display: none;">
<!--
<label>Save HTM file: <a id="dataurllink" href="" download="Headers.htm" type="text/html"
style="-moz-appearance: button; -webkit-appearance: button; appearance: button; text-decoration: none; color: initial;">Download...</a></label> --><!-- TODO 'appearance' removed from CSS3 -->
<label>Save HTM file: <a id="dataurllink" href="" download="Headers.htm" type="text/html"
style="text-decoration: none;"><button type="button">Download...</button></a></label>
</div>
</div>
<script id="xslt2html" type="text/xml" style="display:none;">
<xsl:stylesheet version="1.0" xmlns:nzbns="http://www.newzbin.com/DTD/2003/nzb" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" version="4.0" encoding="utf-8" />
<xsl:template match="/nzbns:nzb">
<xsl:element name="html">
<xsl:element name="head">
<xsl:element name="title">Message Headers</xsl:element>
</xsl:element>
<xsl:element name="body">
<div>
<table border="0" width="100%">
<colgroup>
<col width="0%" />
<col width="66%" />
<col width="33%" />
<col width="0%" />
</colgroup>
<thead style="display: table-header-group;">
<tr>
<th style="border-bottom: 2px solid #808080;">Lines</th>
<th style="border-bottom: 2px solid #808080; text-align: left;">Subject</th>
<th style="border-bottom: 2px solid #808080; text-align: left;">Author</th>
<th style="border-bottom: 2px solid #808080; text-align: left;">Date</th>
</tr>
</thead>
<tbody>
<xsl:for-each select="nzbns:file">
<tr style="vertical-align: top;">
<xsl:variable name="agentguessbytes" select="sum(nzbns:segments/nzbns:segment/@bytes)" />
<xsl:variable name="countsegments" select="count(nzbns:segments/nzbns:segment)" />
<xsl:choose>
<xsl:when test="$agentguessbytes &lt;= 400 * $countsegments">
<td style="text-align: right;"><xsl:value-of select="round($agentguessbytes div 40)"/></td>
</xsl:when>
<xsl:otherwise>
<td style="text-align: right;"><xsl:value-of select="floor(10 * $countsegments + ($agentguessbytes - 400 * $countsegments) div 128)"/></td>
</xsl:otherwise>
</xsl:choose>
<td style="padding-left: 0.5em;"><xsl:value-of select="@subject"/></td>
<td style="padding-left: 0.5em;"><xsl:value-of select="@poster"/></td>
<td style="padding-left: 0.5em; white-space: nowrap;" x-unixtime="{@date}"></td>
</tr>
</xsl:for-each>
</tbody>
</table>
</div>
</xsl:element>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
</script>
</body>
</html>

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~>% nzb2htm.htm %<~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
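
An NZB file records byte counts per segment, not line counts, so the Lines column produced by the stylesheet above is an estimate. The sketch below restates that estimate in plain JavaScript so the arithmetic is easier to follow. The function name estimateLines is made up for illustration, and the comments describe only what the formula computes, not why those constants were chosen.

```javascript
// Sketch of the stylesheet's "Lines" estimate, restated in plain JavaScript.
// segmentBytes: the bytes="..." values of one <file>'s <segment> elements.
function estimateLines(segmentBytes) {
  const totalBytes = segmentBytes.reduce((sum, b) => sum + b, 0);
  const count = segmentBytes.length;
  if (totalBytes <= 400 * count) {
    // Up to 400 bytes per segment on average: one line per 40 bytes.
    return Math.round(totalBytes / 40);
  }
  // Otherwise: 10 lines per segment for the first 400 bytes of each,
  // plus one line per 128 bytes of whatever is left over.
  return Math.floor(10 * count + (totalBytes - 400 * count) / 128);
}

console.log(estimateLines([200]));       // 5
console.log(estimateLines([1680]));      // 20
console.log(estimateLines([400, 400]));  // 20
```

The two branches agree at the boundary (400 bytes per segment gives 10 lines per segment either way), so the estimate does not jump as messages grow.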


--
Kind regards
Ralph

Mandy Liefbowitz

Jun 3, 2018, 1:39:34 PM
On Sun, 03 Jun 2018 18:00:17 +1200, Ralph Fox <-rf-nz-@-.invalid>
wrote:

>On Sat, 02 Jun 2018 08:50:23 -0700, Mr. Mike wrote:
>
>> Is there any way you can save the list of subjects from a Usenet
>> group as seen with Agent into a text file, or even an Excel file with
>> the various columns: Lines/Subject/Author/Date?
>
>
>Here is how to save the message list as an HTML file with the columns
>Lines/Subject/Author/Date.

<snipped good stuff>

Hey! Way cool, thank you.

That is ever so neat.

I can even figure out how to add other columns to the data should I
wish to, or how to not display the "lines" column as I see no real use
for that one.
I could even move the columns around in the HTML. That could be very
useful.

Next question, and I'm sure this one has been asked before, is there
a way to set those lines of HTML as active links to the individual
saved messages in a folder, directory or archive file? With threading?
An *easy* way that means I don't have to do any work? I suspect it's
within my abilities given the massive help you have already provided
but it's far too hot here to do HTML programming if you have already
done it all for us.

Thank you, you are a very nice, helpful person,
Mand.

Mr. Mike

Jun 3, 2018, 2:40:16 PM
Of course, I forgot to mention a couple of things, duh!

These listings with Status/Lines/Subject/Author/Date I'm seeing in
Agent are for megajoined binary files which are not downloaded yet. So
the listings you see on screen are the "joined" listings for these
files.

If you use the above method with Firefox (which works very well,
thanks) you end up with huge lists of all the individual parts of the
files, i.e., part01.rar, etc., and up into the hundreds for individual
files.

The resulting list as viewed in the browser can be copied and pasted,
but then you have to find some way to filter out the extraneous
material so you end up with a list which just contains one each of the
various file names which could then be sorted alphabetically.

There are ways to do this with macros in Word and Excel, a bit too
hairy for me, unfortunately.
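
The filtering step does not need Word or Excel macros. Here is a rough sketch, assuming the subjects have been copied out one per line and that each subject carries the file name in double quotes. The naming patterns it strips (name.partNN.rar, name.rar, name.rNN, name.NNN) are assumptions about how the parts are named, and the function name uniqueBaseNames is made up for illustration.

```javascript
// Sketch: collapse per-part subjects into one sorted entry per file set.
// Assumption: each subject contains the file name in double quotes, e.g.
//   Trip [1/3] - "trip.part01.rar" yEnc (1/40)
function uniqueBaseNames(subjects) {
  const names = new Set();
  for (const subject of subjects) {
    const quoted = /"([^"]+)"/.exec(subject);
    if (!quoted) continue;                   // no quoted file name: skip it
    const base = quoted[1]
      .replace(/\.part\d+\.rar$/i, "")       // name.part01.rar -> name
      .replace(/\.(rar|r\d+|\d{3})$/i, "");  // name.rar, name.r00, name.001 -> name
    names.add(base);
  }
  // Alphabetical order, ignoring upper/lower case.
  return [...names].sort((a, b) =>
    a.localeCompare(b, undefined, { sensitivity: "base" }));
}

const subjects = [
  'Trip [1/3] - "trip.part01.rar" yEnc (1/40)',
  'Trip [2/3] - "trip.part02.rar" yEnc (1/40)',
  'Trip [3/3] - "trip.part03.rar" yEnc (1/12)',
  'Album [1/2] - "album.rar" yEnc (1/40)',
  'Album [2/2] - "album.r00" yEnc (1/40)',
];
console.log(uniqueBaseNames(subjects).join("\n"));
```

Subjects without a quoted file name are skipped, so the patterns would need adjusting for groups that use a different subject layout.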

Ralph Fox

Jun 3, 2018, 5:50:19 PM
Unfortunately, a simple NZB converter of this type is unlikely to be able
to do what you want.

The NZB file format (invented by NewZBin) handles "joins" but not "megajoins".

* So ordinary joined files are recorded inside the NZB file with a single
Subject/Author/Date entry.
* But megajoined posts are recorded inside an NZB file as if they were
joined but not megajoined, with separate Subject/Author/Date records
for each individual part part01.rar, etc.
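
To see how many separate records an NZB file really contains, the entries can be listed directly. The sketch below turns each <file> element into one CSV row of Subject/Author/Date, which Excel can open. It assumes double-quoted attributes as in the made-up sample, and it uses a regular expression only to stay dependency-free; a real XML parser, such as the DOMParser used in nzb2htm.htm, is more robust. The function names are made up for illustration.

```javascript
// Sketch: list the <file> entries of an NZB document as CSV rows.
// Assumption: attributes are double-quoted and contain no raw ">".
function unescapeXml(s) {
  return s.replace(/&lt;/g, "<").replace(/&gt;/g, ">")
          .replace(/&quot;/g, '"').replace(/&apos;/g, "'")
          .replace(/&amp;/g, "&");   // &amp; last, to avoid unescaping twice
}

function csvField(s) {
  // Quote every field and double any embedded quotes.
  return '"' + String(s).replace(/"/g, '""') + '"';
}

function nzbToCsv(nzbText) {
  const rows = [["Subject", "Author", "Date"].map(csvField).join(",")];
  for (const tag of nzbText.match(/<file\b[^>]*>/g) || []) {
    const attr = name => {
      const m = new RegExp("\\b" + name + '="([^"]*)"').exec(tag);
      return m ? unescapeXml(m[1]) : "";
    };
    // The date attribute is Unix time in seconds; leave blank if unusable.
    const raw = attr("date");
    const d = new Date(1000 * Number(raw));
    const date = raw === "" || Number.isNaN(d.getTime()) ? "" : d.toISOString();
    rows.push([attr("subject"), attr("poster"), date].map(csvField).join(","));
  }
  return rows.join("\r\n");
}

const sampleNzb = `<?xml version="1.0" encoding="utf-8"?>
<nzb xmlns="http://www.newzbin.com/DTD/2003/nzb">
  <file poster="Mr. Mike &lt;mike@example.invalid&gt;" date="1527954623" subject="Say &quot;hi&quot;">
    <groups><group>alt.test</group></groups>
    <segments><segment bytes="1200" number="1">abc123@example.invalid</segment></segments>
  </file>
</nzb>`;

console.log(nzbToCsv(sampleNzb));
```

For a megajoined post, each part would show up as its own row here, which matches the long lists described above.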


--
Kind regards
Ralph

Ralph Fox

Jun 3, 2018, 7:01:46 PM
On Sun, 03 Jun 2018 18:38:36 +0100, Mandy Liefbowitz wrote:

> Next question, and I'm sure this one has been asked before, is there
> a way to set those lines of HTML as active links to the individual
> saved messages in a folder, directory or archive file?

1) Unfortunately the NZB to HTML converter does not know the file
which each message has been saved as, so it cannot generate
links to these files.

2) The NZB to HTML converter could set those lines as active news:
URL links. If news: URLs were associated with Agent in Windows,
this would open Agent which would try to find the message in
an Agent folder.
a) Agent's setting "search for message URLs in" would need to
be set to "All folders", at "Tools >> Options >> URL and
MIME Settings >> URLs".
b) This may not work properly for joined messages. Joined
messages are joined from separate messages each with its
own message-ID. There is no way to guarantee that the one
message-ID which the converter picks to use in the news:
URL link is the same as the one message-ID which Agent
picked for the joined header.
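
For illustration only, here is a sketch of how a news: URL could be built from the message-ID text of an NZB <segment>. The function name newsUrl is made up, and percent-encoding everything that encodeURIComponent encodes is a conservative assumption; whether Agent accepts percent-escapes in news: URLs has not been tested here.

```javascript
// Sketch: build a news: URL from the message-ID stored in an NZB <segment>.
// Assumption: the ID may or may not carry its surrounding angle brackets.
function newsUrl(messageId) {
  const bare = messageId.trim().replace(/^<|>$/g, "");
  const at = bare.lastIndexOf("@");
  if (at < 1 || at === bare.length - 1) {
    throw new Error("Not a message-ID: " + messageId);
  }
  // Percent-encode each side of the "@" so characters such as "#", "?"
  // and "%" cannot be misread as URL syntax.
  return "news:" + encodeURIComponent(bare.slice(0, at)) +
         "@" + encodeURIComponent(bare.slice(at + 1));
}

console.log(newsUrl("<abc$123#x@example.invalid>"));
```

The converter would still have to pick one segment's message-ID per row, which is exactly the limitation described in b) above.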

> With threading?

Unfortunately the NZB file format (invented by NewZBin) contains no
threading information. So the NZB to HTML converter does not know
what the correct threading is.


> An *easy* way that means I don't have to do any work? I suspect it's
> within my abilities given the massive help you have already provided
> but it's far too hot here to do HTML programming if you have already
> done it all for us.


--
Kind regards
Ralph

Mandy Liefbowitz

Jun 3, 2018, 9:38:28 PM
On Mon, 04 Jun 2018 11:01:35 +1200, Ralph Fox <-rf-nz-@-.invalid>
wrote:
Sir, I was only about 0.001% serious but *thank* *you* ever so very,
very much. You are a treasure of great price and so exceedingly nice
and helpful. As always.
{Hugs}
Mand.

Richard Nelson

Jun 4, 2018, 4:20:40 PM
On Sun, 03 Jun 2018 18:00:17 +1200, Ralph Fox <-rf-nz-@-.invalid>
wrote:

>On Sat, 02 Jun 2018 08:50:23 -0700, Mr. Mike wrote:
>
>> Is there any way you can save the list of subjects from a Usenet
>> group as seen with Agent into a text file, or even an Excel file with
>> the various columns: Lines/Subject/Author/Date?
>
>
>Here is how to save the message list as an HTML file with the columns
>Lines/Subject/Author/Date.
>
>This works whether or not the bodies have been downloaded.
>This requires Agent 4.2 or later.
>


Very, very cool. Thank you Ralph. I assume you authored this? Kudos
to you!